Add `allow_partial` #1512

samuelcolvin · 2024-10-30T18:58:20Z

See pydantic/pydantic#10748 for details.

Selected Reviewer: @sydney-runkle

codspeed-hq · 2024-10-30T19:05:31Z

CodSpeed Performance Report

Merging #1512 will degrade performances by 16.62%

Comparing allow_partial (b1226d5) with main (5e95c05)

Summary

❌ 1 (👁 1) regressions
✅ 154 untouched benchmarks

Benchmarks breakdown

	Benchmark	`main`	`allow_partial`	Change
👁	`test_frozenset_of_ints_duplicates_core`	144.2 µs	173 µs	-16.62%

sydney-runkle

Surprisingly minimal changes here given the new functionality added - cool!

I'm a bit confused about the naming around enumerate_last_partial - perhaps we could bikeshed just a bit here to make the purpose of this function more clear?

python/pydantic_core/_pydantic_core.pyi

sydney-runkle · 2024-10-31T17:44:51Z

tests/validators/test_allow_partial.py

+from pydantic_core import SchemaValidator, ValidationError, core_schema
+
+
+def test_list():


Should we just parametrize here and have a test for validation fails and validation successes?

(same questions below)

Otherwise, tests lgtm

sydney-runkle · 2024-11-01T01:05:45Z

Curious to hear what @davidhewitt thinks about the way this is implemented - overall, looks good to me, I think having things centralized around the allow_partial in the state makes sense, and having a supports_partial feature on each validator (that defaults to false) seems pretty intuitive as well.

It's interesting to me the ways in which fail_fast and allow_partial are sometimes tied together. Perhaps adding documentation in a similar section would be helpful for users.

davidhewitt

Overall functionally looks like it's heading in the right direction, but I think there can be some simplifications. I also have a bunch of questions which hint at edge cases.

python/pydantic_core/_pydantic_core.pyi

src/input/input_abstract.rs

davidhewitt · 2024-11-01T09:21:10Z

src/input/input_python.rs

+            Self::Dict(dict) => dict.keys().iter().last(),
+            Self::Mapping(mapping) => mapping.keys().ok()?.iter().ok()?.last()?.ok(),


NB both of these are potentially inefficient; .keys() creates a new PyList of all the keys.

yup, I don't love it, but at least it's only called when allow_partial=true.

Is there a more efficient way? (especially for dicts?)

Long term, if we move to RustModel and iterating over dicts when building typeddicts, we should be able to minimise use of this.

.call_method0("keys").iter().last() might be better (Python keys iterator rather than list), but I think it still sucks. Can't get better until we iterate, as you say.

I tested by adding:

#[pyfunction]
pub fn mapping_last_key_a<'a>(mapping: &'a Bound<'a, PyMapping>) -> Option<Bound<'a, PyAny>> {
    mapping.keys().ok()?.iter().ok()?.last()?.ok()
}

#[pyfunction]
pub fn mapping_last_key_b<'a>(mapping: &'a Bound<'a, PyMapping>) -> Option<Bound<'a, PyAny>> {
    mapping.call_method0(intern!(mapping.py(), "keys")).ok()?.iter().ok()?.last()?.ok()
}

and tested with

from typing import Mapping
import timeit

from pydantic_core import _pydantic_core


class MyMapping(Mapping):
    def __init__(self, d):
        self._d = d

    def __getitem__(self, key):
        return self._d[key]

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)


mapping = MyMapping({str(i): i for i in range(100)})

v = _pydantic_core.mapping_last_key_a(mapping)
assert v == '99', v
v = _pydantic_core.mapping_last_key_b(mapping)
assert v == '99', v


def run_bench(func):
    timer = timeit.Timer(
        "func(mapping)", setup="", globals={"func": func, "mapping": mapping}
    )
    n, t = timer.autorange()
    iter_time = t / n
    # print(f'{func.__module__}.{func.__name__}', iter_time)
    return int(iter_time * 1_000_000_000)


print(f'mapping_last_key_a: {run_bench(_pydantic_core.mapping_last_key_a)}ns')
print(f'mapping_last_key_b: {run_bench(_pydantic_core.mapping_last_key_b)}ns')

Output:

mapping_last_key_a: 2487ns
mapping_last_key_b: 1918ns

Outcome: they're both pretty slow, but your suggestion is slightly faster.
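For readers less familiar with the trade-off, here is a minimal pure-Python sketch of the approaches discussed above: building a list of all keys versus walking the keys iterator. The function names are hypothetical and none of this is the PyO3 code from the PR. The last function relies on plain `dict` supporting `reversed()` (Python 3.8+); whether an equivalent shortcut is reachable from the Rust side is not something this thread establishes.

```python
def last_key_via_list(mapping):
    """Materialize every key into a list, then take the final one."""
    keys = list(mapping.keys())
    return keys[-1] if keys else None


def last_key_via_iter(mapping):
    """Walk the keys iterator, keeping only the most recent key seen."""
    last = None
    for last in mapping.keys():
        pass
    return last


def last_key_dict(d):
    """For a real dict, reversed() starts at the most recently inserted key."""
    return next(reversed(d), None)


d = {str(i): i for i in range(100)}
print(last_key_via_list(d), last_key_via_iter(d), last_key_dict(d))
```

Both of the first two functions still visit every key, which matches the observation that neither variant is cheap. Note that all three return `None` for an empty mapping, so they cannot distinguish "empty" from "last key is `None`".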

davidhewitt · 2024-11-01T09:26:19Z

src/validators/validation_state.rs

+    pub fn enumerate_last_partial<'i, I>(
+        &self,
+        iter: impl Iterator<Item = I> + 'i,
+    ) -> Box<dyn Iterator<Item = (usize, bool, I)> + 'i> {
+        if self.allow_partial {
+            Box::new(EnumerateLastPartial::new(iter))
+        } else {
+            Box::new(iter.enumerate().map(|(i, x)| (i, false, x)))
+        }
+    }


Rather than forcing dynamic dispatch here, it might be better to use .peekable() at the callsites to be able to check if the current iteration was the last as part of the error pathway.

great idea!

peekable made things worse, but I got rid of the Box dyn in 47f1c15

and it helped quite a lot locally - comparing main to 47f1c15:

Group	Benchmark	Before (µs/iter)	After (µs/iter)	Change
List	list_of_ints_core_py	29.17	30.09	+3.17%
List JSON	list_of_ints_core_json	42.49	43.57	+2.55%
Set	set_of_ints_core	52.26	52.81	+1.06%
Set	set_of_ints_core_duplicates	35.59	35.51	-0.22%
Set	set_of_ints_core_length	53.59	55.16	+2.94%
Set JSON	set_of_ints_core_json	56.31	58.40	+3.70%
Set JSON	set_of_ints_core_json_duplicates	44.14	45.07	+2.12%
FrozenSet	frozenset_of_ints_core	16.60	17.26	+3.99%
FrozenSet	frozenset_of_ints_duplicates_core	13.78	14.78	+7.20%
Dict	dict_of_ints_core	85.74	86.36	+0.73%
Dict JSON	dict_of_ints_core_json	127.4	126.5	-0.72%
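To make the purpose of `enumerate_last_partial` easier to follow, here is a rough Python sketch of the underlying idea: enumerate an iterator while holding one item back, so each item can be tagged with whether it is the last. The name `enumerate_last` is hypothetical, and the sketch ignores the `allow_partial` flag; it is only an illustration, not the Rust `EnumerateLastPartial` implementation.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def enumerate_last(items: Iterable[T]) -> Iterator[Tuple[int, bool, T]]:
    """Yield (index, is_last, item), holding one item back as lookahead."""
    iterator = iter(items)
    try:
        current = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    index = 0
    for upcoming in iterator:
        # Another item exists, so `current` cannot be the last one.
        yield index, False, current
        current = upcoming
        index += 1
    # Nothing followed `current`, so it is the final element.
    yield index, True, current


for index, is_last, item in enumerate_last(["a", "b", "c"]):
    print(index, is_last, item)
```

The one-item lookahead is what lets the caller decide, at the moment an item fails, whether that failure is in the trailing position.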

davidhewitt · 2024-11-01T09:28:41Z

tests/validators/test_allow_partial.py

+
+
+def test_tuple_list():
+    """Tuples don't support partial, so behaviour should be disabled."""


Why not? At least for variadic tuples this seems potentially acceptable.

I definitely want to support it in future, I was just implementing it for the minimum set of validators to prove the idea initially.

tests/emscripten_runner.js

davidhewitt · 2024-11-01T09:30:38Z

src/input/return_enums.rs

-    for (index, item_result) in iter.enumerate() {
+
+    for (index, is_last_partial, item_result) in state.enumerate_last_partial(iter) {
+        state.allow_partial = is_last_partial && validator.supports_partial();


Why is the .supports_partial() guard necessary (here and in general)?

because if you enabled allow_partial for a validator which doesn't have proper support for the feature, you would end up skipping errors in the wrong places.

E.g. if you had list[tuple[list[int], list[str]]]:

the outer list validator supports allow_partial, so it'll ignore errors in the last entry in the input list

but if it passed allow_partial=true down to the tuple validator (which doesn't currently support allow_partial)

it would pass state through to both inner list validators unchanged

then the list validator associated with the 0th entry in the tuple would get allow_partial=true

and therefore errors in the last entry of the 0th member of the tuple in the last entry of the outer list would be ignored when they shouldn't be

It feels to me like the "doesn't support partial" is the exception rather than the rule, and probably only for non-variadic tuples? Otherwise, I would have thought that:

collections would always want to respect allow_partial when passed in (forwarding it to their last element)

everything else e.g. with_default etc should not even care about allow_partial and just forward it naively
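The failure mode described in this thread can be reproduced with a toy model. The sketch below is pure Python with hypothetical helpers (`make_list`, `make_tuple`, `validate_int`), not pydantic-core code, and it uses two `list[int]` members instead of `list[int]` and `list[str]` to stay short. `forward_partial=True` models a tuple validator that passes the flag through unchanged; `forward_partial=False` models the guard.

```python
def validate_int(value, allow_partial):
    if not isinstance(value, int):
        raise ValueError(f"not an int: {value!r}")
    return value


def make_list(item_validator):
    def validate(values, allow_partial):
        out = []
        for index, value in enumerate(values):
            is_last = index == len(values) - 1
            try:
                # Only the last item is validated in partial mode.
                out.append(item_validator(value, allow_partial and is_last))
            except ValueError:
                if allow_partial and is_last:
                    break  # drop the trailing, possibly incomplete item
                raise
        return out
    return validate


def make_tuple(item_validators, forward_partial):
    def validate(values, allow_partial):
        flag = allow_partial and forward_partial
        return tuple(v(x, flag) for v, x in zip(item_validators, values))
    return validate


inner = make_list(validate_int)
naive = make_list(make_tuple([inner, inner], forward_partial=True))
guarded = make_list(make_tuple([inner, inner], forward_partial=False))

data = [([1], [2]), ([3, "x"], [4])]
print(naive(data, True))    # the bad "x" is silently dropped mid-input
print(guarded(data, True))  # the whole last entry is dropped instead
```

In the naive case the error on `"x"` is swallowed even though `[4]` follows it in the input, which is the "errors skipped in the wrong place" problem. With the guard, the error surfaces at the outer list, which then discards its last entry as a unit.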

src/validators/dict.rs

davidhewitt · 2024-11-01T09:32:53Z

src/validators/dict.rs

                Err(ValError::LineErrors(line_errors)) => {
-                    for err in line_errors {
-                        errors.push(err.with_outer_location(key.clone()));
+                    if !is_last_partial {


Ok I'm feeling very baited to refactor these error combiners now 😂

Opened #1517 as a spike to start down that road.

davidhewitt

I definitely want to support it in future, I was just implementing it for the minimum set of validators to prove the idea initially.

I think we need it to be approximately right for most validators to actually be worth inviting users to try, and I'm not sure it is at the moment.

davidhewitt · 2024-11-01T16:56:54Z

tests/validators/test_allow_partial.py

It feels like there's very common use cases missing from here:

unions

nullable

defaults

I think validators fit into four cases:

already done

doesn't apply - things like IntValidator where allow_partial doesn't mean anything

TODO simple cases - things like nullable where AFAIK it's just a matter of passing the allow_partial instruction to its nested validator - that's my understanding of these but maybe it's more complicated

TODO - collections or multiple nested validators where we need custom logic

I think validators break down like this:

Validator	Status
`TypedDictValidator`	DONE
`UnionValidator`	TODO simple case?
`TaggedUnionValidator`	TODO simple case
`NullableValidator`	TODO simple case
`ModelValidator`	doesn't apply
`ModelFieldsValidator`	TODO
`DataclassArgsValidator`	TODO
`DataclassValidator`	doesn't apply
`StrValidator`	doesn't apply
`StrConstrainedValidator`	doesn't apply
`IntValidator`	doesn't apply
`ConstrainedIntValidator`	doesn't apply
`BoolValidator`	doesn't apply
`FloatValidator`	doesn't apply
`ConstrainedFloatValidator`	doesn't apply
`DecimalValidator`	doesn't apply
`ListValidator`	DONE
`SetValidator`	DONE
`TupleValidator`	TODO
`DictValidator`	DONE
`NoneValidator`	doesn't apply
`FunctionBeforeValidator`	TODO simple case?
`FunctionAfterValidator`	TODO simple case?
`FunctionPlainValidator`	TODO simple case?
`FunctionWrapValidator`	TODO simple case?
`CallValidator`	TODO simple case?
`LiteralValidator`	doesn't apply
`IntEnumValidator`	doesn't apply
`StrEnumValidator`	doesn't apply
`FloatEnumValidator`	doesn't apply
`PlainEnumValidator`	doesn't apply
`AnyValidator`	doesn't apply
`BytesValidator`	doesn't apply
`BytesConstrainedValidator`	doesn't apply
`DateValidator`	doesn't apply
`TimeValidator`	doesn't apply
`DateTimeValidator`	doesn't apply
`FrozenSetValidator`	DONE
`TimeDeltaValidator`	doesn't apply
`IsInstanceValidator`	doesn't apply
`IsSubclassValidator`	doesn't apply
`CallableValidator`	doesn't apply
`ArgumentsValidator`	TODO
`WithDefaultValidator`	TODO simple case
`ChainValidator`	TODO simple case?
`LaxOrStrictValidator`	TODO simple case?
`GeneratorValidator`	TODO
`CustomErrorValidator`	TODO simple case?
`JsonValidator`	TODO
`UrlValidator`	doesn't apply
`MultiHostUrlValidator`	doesn't apply
`UuidValidator`	doesn't apply
`DefinitionRefValidator`	TODO
`JsonOrPython`	TODO simple case?
`ComplexValidator`	doesn't apply

If that table is correct, we should set the default for supports_partial to true, then set it to false specifically for fields which don't support it. But I'm still not that confident I'm right about all those cases.

I got rid of supports_partial completely, and instead set state.allow_partial = false on the few validators which don't support it yet.
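As a rough illustration of that opt-out design, here is a toy Python model in which the flag lives on a shared state object, wrapper validators forward it simply by not touching it, and a not-yet-supported validator resets it. All class names (`ToyState`, `ToyInt`, `ToyList`, `ToyNullable`, `ToyTuple`) are hypothetical stand-ins; the real logic lives in the Rust validators and may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class ToyState:
    allow_partial: bool = False


class ToyInt:
    def validate(self, value, state):
        if not isinstance(value, int):
            raise ValueError(f"not an int: {value!r}")
        return value


class ToyList:
    """Supports partial: tolerates a failure in the last item only."""

    def __init__(self, inner):
        self.inner = inner

    def validate(self, values, state):
        allow_partial = state.allow_partial
        out = []
        for index, value in enumerate(values):
            is_last = allow_partial and index == len(values) - 1
            state.allow_partial = is_last
            try:
                out.append(self.inner.validate(value, state))
            except ValueError:
                if not is_last:
                    raise
            finally:
                state.allow_partial = allow_partial  # restore for the caller
        return out


class ToyNullable:
    """Wrapper: leaves state untouched, so the flag reaches the inner validator."""

    def __init__(self, inner):
        self.inner = inner

    def validate(self, value, state):
        return None if value is None else self.inner.validate(value, state)


class ToyTuple:
    """Not supported yet: switches the flag off before validating its items."""

    def __init__(self, inners):
        self.inners = inners

    def validate(self, values, state):
        state.allow_partial = False
        return tuple(v.validate(x, state) for v, x in zip(self.inners, values))


ints = ToyList(ToyInt())
print(ToyList(ToyNullable(ints)).validate([[1], [2, "x"]], ToyState(True)))
print(ToyList(ToyTuple([ints, ints])).validate(
    [([1], [2]), ([3, "x"], [4])], ToyState(True)
))
```

The point of the default is that a wrapper such as the nullable stand-in needs no code at all to participate, while the tuple stand-in documents its lack of support with a single explicit line.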

samuelcolvin · 2024-11-02T18:03:52Z

please review

davidhewitt

This looks good to me now, I think this is the right default and makes it explicit where we have decided not to support yet 👍

src/serializers/fields.rs

Co-authored-by: David Hewitt <[email protected]>

samuelcolvin added 2 commits October 30, 2024 18:58

add "allow_partial" support

2bf9d9b

tests for set,frozenset,list,dict,typed_dict

554ec40

samuelcolvin force-pushed the allow_partial branch from 5af9b88 to 554ec40 Compare October 30, 2024 18:59

fix linting

6cd985d

samuelcolvin added 7 commits October 30, 2024 19:09

revert formatting changes

b5b2332

fix js tests and benchmarks

45ef7d5

support partial JSON parsing

92e694f

move allow_partial to state, and set it correctly

470754d

require validators to support partial

bd127eb

try to fix js tests

417ad13

fix tests

0d453a0

samuelcolvin mentioned this pull request Oct 31, 2024

Add experimental_allow_partial support pydantic/pydantic#10748

Merged

samuelcolvin added 3 commits October 31, 2024 16:32

support typeddicts where not all fields are not required

ee72ba9

tweaks

132f757

uprev jiter

f28aaba

sydney-runkle reviewed Nov 1, 2024


davidhewitt reviewed Nov 1, 2024


davidhewitt mentioned this pull request Nov 1, 2024

begin refactoring ValLineError collection #1517

Open

4 tasks

samuelcolvin added 2 commits November 1, 2024 12:47

removee Box dyn by always using EnumerateLastPartial

47f1c15

tweak comments

e220710

davidhewitt requested changes Nov 1, 2024


pydantic-hooky bot added awaiting author revision labels Nov 1, 2024

pydantic-hooky bot assigned samuelcolvin Nov 1, 2024

samuelcolvin added 2 commits November 2, 2024 17:56

switch defautl for supports_partial

a0936cd

support nested JSON validators

6478dba

pydantic-hooky bot added the ready for review label Nov 2, 2024

pydantic-hooky bot removed the awaiting author revision label Nov 2, 2024

pydantic-hooky bot assigned sydney-runkle and unassigned samuelcolvin Nov 2, 2024

samuelcolvin added 2 commits November 2, 2024 18:32

switch last_key method for mapping

5a521a8

remove support_partial

327083c

This was referenced Nov 3, 2024

remove remaining state.allow_partial = false; #1523

Open

make CLI arguments more concise 15r10nk/inline-snapshot#123

Closed

davidhewitt approved these changes Nov 4, 2024


src/serializers/fields.rs (outdated, resolved)

samuelcolvin and others added 2 commits November 4, 2024 10:47

accept @davidhewitt's suggestion

833ddfb

Co-authored-by: David Hewitt <[email protected]>

Merge branch 'main' into allow_partial

b1226d5

samuelcolvin enabled auto-merge (squash) November 4, 2024 10:51

samuelcolvin merged commit a1fa596 into main Nov 4, 2024
28 checks passed

samuelcolvin deleted the allow_partial branch November 4, 2024 10:55

vasudev-gm mentioned this pull request Dec 9, 2024

Bump pydantic-core from 2.20.1 to 2.27.1 vasudev-gm/magika_demo#65

Merged

